Bridging the Language Gap: Topic Adaptation for Documents with Different Technicality
Abstract
The language gap, for example between low-literacy laypersons and highly technical expert documents, is a fundamental barrier to cross-domain knowledge transfer. This paper seeks to close the gap at the thematic level via topic adaptation, i.e., adjusting the topical structures of cross-domain documents according to a domain factor such as technicality. We present a probabilistic model for this purpose based on joint modeling of topic and technicality. The proposed τLDA model explicitly encodes the interplay between topic and technicality hierarchies, providing an effective topic-level bridge between lay and expert documents. We demonstrate the usefulness of τLDA with an application to consumer medical informatics.
Similar resources
Cross border E-Science and Research Partnership: Bridging the Gap Between Science and Media
E-Science is a tool that helps scientists to store, interpret, analyze and make a network of their data, and it can play a critical role in different aspects of the scientific goals and research. This commentary, under the topic of Cross Border E-Science and Research Partnership: Bridging the Gap between Science and Media,[1] attempts to shed light on E-Science with emphasis on three importa...
POSTECH at NTCIR-5 Patent Retrieval: Smoothing Experiments in a Language Modeling Approach to Patent Retrieval
This report describes the experimental results of our participation at the Document Retrieval Subtask of NTCIR-5 Patent Retrieval Task. Unlike newspaper articles which belong to the main document type handled in previous information retrieval experiments, patent documents have many different characteristics in terms of length, technicality, structureness, etc. Among these, we focus on the lengt...
Bayesian Bridging Topic Models for Classification
We study the problem of constructing the topic-based model over different domains for text classification. In real-world applications, there are abundant unlabeled documents but sparse labeled documents. It is challenging to construct a reliable and adaptive model to classify a large amount of documents containing different domains. The classifiers trained from a source domain shall perform poo...
Causes of the Gap between Junior High School Intended, Implemented, and Attained Curricula and Ways of Bridging It
M.A. Jamaalifar; S. Sh. HaashemiMoghadam, Ph.D.; Z. Aabedi Karajibaan, Ph.D.; A.R. Faghihi, Ph.D. To identify the causes of the perceived gap between junior high school intended, implemented, and attained curricula, a group of 30 curriculum planners, 50 educationa...
Applied Cognitive Psychology
The adaptation of vocabulary between communication partners, i.e. the lexical entrainment phenomenon, is well documented. This study investigates whether the phenomenon can also be found in computer-mediated communication between experts and laypersons. The respondents, who are medical experts (n = 46), answered fictitious patients’ queries on health problems. Language technicality within p...